Goal-based composition of scalable hybrid analytics for heterogeneous architectures
Authors
Abstract
Crafting scalable analytics in order to extract actionable business intelligence is a challenging endeavour, requiring multiple layers of expertise and experience. Often, this expertise is irreconcilably split between an organisation’s engineers and subject-matter domain experts. Previous approaches to this problem have relied on technically adept users with tool-specific training. Such an approach has a number of challenges. Expertise: there are few data-analytic subject domain experts with in-depth technical knowledge of compute architectures. Performance: analysts do not generally make full use of the performance and scalability capabilities of the underlying architectures. Heterogeneity: calculating the most performant and scalable mix of real-time (on-line) and batch (off-line) analytics in a problem domain is difficult. Tools: supporting frameworks will often address several tasks, including composition, planning, code generation, validation, performance tuning and analysis, but do not typically provide end-to-end solutions embedding all of these activities. In this paper, we present a novel semi-automated approach to the composition, planning, code generation and performance tuning of scalable hybrid analytics, using a semantically rich type system which requires little programming expertise from the user. This approach is the first of its kind to permit domain experts with little or no technical expertise to assemble complex and scalable analytics, for hybrid on- and off-line analytic environments, with no additional requirement for low-level engineering support. This paper describes (i) an abstract model of analytic assembly and execution, (ii) goal-based planning and (iii) code generation for hybrid on- and off-line analytics. An implementation, through a system which we call Mendeleev, is used to (iv) demonstrate the applicability of this technique through a series of case studies, where a single interface is used to create analytics that can be run simultaneously over on- and off-line environments. Finally, we (v) analyse the performance of the planner, and (vi) show that the performance of Mendeleev’s generated code is comparable with that of hand-written analytics.
© 2016 The Author(s). Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
Corresponding author. E-mail addresses: [email protected] (P. Coetzee), [email protected] (S.A. Jarvis).
Similar resources
Design of a novel congestion-aware communication mechanism for wireless NoC architecture in multicore systems
Hybrid Wireless Network-on-Chip (WNoC) architecture has emerged as a scalable communication structure to mitigate the deficits of traditional NoC architecture for future multi-core systems. The hybrid WNoC architecture provides energy-efficient, high data rate and flexible communications for NoC architectures. In these architectures, each wireless router is shared by a set of processing core...
Hybrid Method of Logistic Regression and Data Envelopment Analysis for Event Prediction: A Case Study (Stroke Disease)
Predictive analytics is an area of statistics that deals with extracting information from data and using it to predict trends and behavior patterns. Many mathematical models have been developed and used for prediction, and in some cases, they have been found to be very strong and reliable. This paper studies different mathematical and statistical approaches for event prediction. The ...
Intelligent scalable image watermarking robust against progressive DWT-based compression using genetic algorithms
Image watermarking refers to the process of embedding an authentication message, called a watermark, into the host image to uniquely identify the ownership. In this paper a novel, intelligent, scalable, robust wavelet-based watermarking approach is proposed. The proposed approach employs a genetic algorithm to find nearly optimal positions to insert the watermark. The embedding positions coded as chr...
Ten Research Questions for Scalable Multimedia Analytics
The scale and complexity of multimedia collections is ever increasing, as is the desire to harvest useful insight from the collections. To optimally support the complex quest for insight, multimedia analytics has emerged as a new research area that combines concepts and techniques from multimedia analysis and visual analytics into a single framework. State of the art multimedia analytics soluti...
Towards a multifrontal QR factorization for heterogeneous architectures over runtime systems
During the last decade, computer architectures for high performance computing have evolved considerably toward heterogeneous systems equipped with different types of computational units and a higher number of cores per chip. GPU-based systems are a popular example of heterogeneous architectures widely adopted in the high performance computing domain. In the work presented in this talk we stud...
Journal title:
- J. Parallel Distrib. Comput.
Volume 108 Issue
Pages -
Publication date 2017